Abstract

There is currently a great deal of interest in the 4D-Var data assim- 
ilation scheme, in which one uses observational data to find the optimal 
initial condition for a differential equation by minimizing a cost function 
over the set of all possible initial states. For nonlinear models this cost 
function can be nonconvex, and so the uniqueness of minimizers is not 
guaranteed. In this paper we apply 4D-Var to Burgers' equation and 
prove that, once a sufficient amount of data has been collected, there can 
be at most one physically reasonable minimizer to the variational problem. 

1 Introduction 

We consider here the problem of finding the "best" initial condition u for Burg- 
ers' equation with Dirichlet boundary conditions 

\[
\begin{aligned}
y_t + y y_x &= \epsilon y_{xx} \\
y(0,t) = y(1,t) &= 0 \\
y(x,0) &= u(x)
\end{aligned} \tag{1}
\]

in the presence of noisy observational data. It is known that a unique solution
to the forward problem exists for any initial state $u$ in the space $V := H_0^1(0,1)$,
with norm defined by $\|u\|_V^2 = \int_0^1 u_x^2\,dx$. We denote this solution by $y(u)$. Since
Burgers' equation can be transformed to the heat equation via the Cole-Hopf
transformation, we have that $y(u) \in C^\infty([0,1] \times (0,\infty))$. The observational data
are assumed to be given by a bounded linear observation operator $H : V \to Z$,
into a Hilbert space $Z$.

Given continuous (in time) observational data z(t) and a fixed maximal 
observation time T, we define a cost functional 

\[
J_T(u) = \int_0^T \|Hy(u) - z\|_Z^2\,dt + \beta\|u - \bar u\|_V^2, \tag{2}
\]
where $\bar u$ is a fixed background term and $\beta$ is a positive constant. Our goal is then
to understand the existence of minimizers for $J_T$, that is, functions $u_0 \in V$ satisfying

\[
J_T(u_0) = \inf\{J_T(u) : u \in V\}. \tag{3}
\]
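As a rough numerical illustration of the cost functional (2), the following Python sketch discretizes Burgers' equation with an explicit finite-difference scheme and approximates $J_T$ by Riemann sums. It assumes, purely for illustration, that $H$ is the identity embedding of $V$ into $Z = L^2(0,1)$ and that the data $z$ are stored on the same space-time grid as the solution. The function names, grid sizes and scheme are hypothetical choices, not taken from the paper.

```python
import math

def burgers_step(y, eps, dx, dt):
    """One explicit Euler step of y_t + y*y_x = eps*y_xx with y = 0 at both ends."""
    n = len(y)
    new = [0.0] * n  # Dirichlet boundary values stay zero
    for i in range(1, n - 1):
        adv = y[i] * (y[i + 1] - y[i - 1]) / (2 * dx)
        diff = eps * (y[i + 1] - 2 * y[i] + y[i - 1]) / dx ** 2
        new[i] = y[i] + dt * (diff - adv)
    return new

def solve_forward(u, eps, T, dx, dt):
    """Return the discrete trajectory [y(., 0), y(., dt), ..., y(., T)]."""
    steps = int(round(T / dt))
    traj = [list(u)]
    for _ in range(steps):
        traj.append(burgers_step(traj[-1], eps, dx, dt))
    return traj

def v_norm_sq(v, dx):
    """Discrete analogue of the V-norm: integral of v_x^2 over (0, 1)."""
    return sum((v[i + 1] - v[i]) ** 2 for i in range(len(v) - 1)) / dx

def cost(u, u_bar, z, eps, beta, T, dx, dt):
    """Riemann-sum approximation of J_T(u), assuming H = identity (an assumption)."""
    traj = solve_forward(u, eps, T, dx, dt)
    misfit = 0.0
    for y_k, z_k in zip(traj, z):
        misfit += dt * dx * sum((a - b) ** 2 for a, b in zip(y_k, z_k))
    diff = [a - b for a, b in zip(u, u_bar)]
    return misfit + beta * v_norm_sq(diff, dx)
```

With synthetic data generated from a known initial state and the background set to that same state, the sketch returns zero cost at that state and a positive cost elsewhere. The explicit scheme is only stable for time steps of roughly `dt <= dx**2 / (2 * eps)`.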

White [3] has established the following fundamental results. 

Theorem 1 (White '93). There exists a solution of (3) for every $T > 0$. Moreover,
there is a positive constant $T_1 = T_1(\epsilon, \beta, z, \|H\|)$ such that the solution is
unique provided $T < T_1$.

White's proof shows that $J_T$ in fact has a unique critical point for $T$ sufficiently
small, which is a stronger result than is stated above as it rules out the possibility
of multiple local minima. The result is not expected to hold for all times $T > 0$
due to the nonlinearity present in (1). However, we have been able to show
that, for sufficiently large observation times, one in fact regains uniqueness of
the minimizer.

Theorem 2. Suppose the observational data satisfy $\|z(t)\|_Z \le c$ for all $t > 0$.
Then for each $K > 0$ there exists a constant $T_2 = T_2(\epsilon, \beta, c, K, \|H\|)$ such that,
for any $T > T_2$, the functional $J_T$ has at most one critical point satisfying

\[
\|u\|_{L^2(0,1)} \le K.
\]

The result says that a bounded set in $L^2(0,1)$ can contain at most one
solution of (3) once $T$ is large enough. This conclusion is satisfying as far as
applications are concerned, since u typically represents a quantity subject to 
certain physical constraints. For instance, a predicted initial state in which the 
fluid velocity u is, on average, greater than the speed of light, would be of little 
practical use. 

One can easily obtain an a priori bound on minimizers, as $J_T(u_0) \le J_T(0)$,
whence $\|u_0 - \bar u\|_V^2 \le \beta^{-1} Z(T)^2 + \|\bar u\|_V^2$, where we have defined the integrated
data function $Z(T)^2 = \int_0^T \|z(t)\|_Z^2\,dt$. However, this estimate only appears to
be useful for small $T$ unless one makes rather strong assumptions concerning
the asymptotic behavior of $Z(T)$. This explains the $L^2$-bound on $u$ required in
Theorem 2, which is not necessary for Theorem 1 to hold. Long-time uniqueness
can be obtained in the absence of the bound $\|u\| \le K$ if one assumes $Z(T) =
o(T)$. Physically this means that the measurement error decreases over time,
e.g. $\|z(t)\|_Z = O(t^{-\delta})$ for any $\delta > 0$. Since this scenario seems unlikely to arise
in practice, we do not dwell on it any longer.
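
The a priori bound mentioned above can be spelled out in a few lines. The following is a sketch, using only the definition of the cost functional in (2) and the fact that the zero initial state produces the zero solution, $y(0) \equiv 0$; the final line assumes the uniform data bound $\|z(t)\|_Z \le c$.

```latex
\begin{aligned}
J_T(0) &= \int_0^T \|H y(0) - z\|_Z^2\,dt + \beta\|\bar u\|_V^2
        = Z(T)^2 + \beta\|\bar u\|_V^2, \\
\beta\|u_0 - \bar u\|_V^2 &\le J_T(u_0) \le J_T(0)
  \quad\Longrightarrow\quad
  \|u_0 - \bar u\|_V^2 \le \beta^{-1} Z(T)^2 + \|\bar u\|_V^2, \\
Z(T)^2 &\le c^2 T \quad \text{when } \|z(t)\|_Z \le c .
\end{aligned}
```

Under the uniform bound the right-hand side therefore grows linearly in $T$, which is why this estimate alone does not control minimizers for long observation windows.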

We finally observe that Theorem 2 holds more generally for polynomially
bounded data, $Z(T) = O(T^N)$, with the required observation time $T_2$ then
additionally depending on $N$. However, the most physically relevant case is
that of uniformly bounded observations with $\|z(t)\|_Z \le c$, hence $Z(T) \le c\sqrt{T}$.
2 Fixed-point interpretation of the Euler-Lagrange
equation

In this section we review the Euler-Lagrange equation for $J_T$, and show that it
can be interpreted as a fixed-point equation for a nonlinear compact operator
$S_T : V \to V$. From [3] we know that the derivative of $J_T$ evaluated at $u$, in the
direction $v \in V$, satisfies

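\[
\begin{aligned}
\tfrac12 DJ_T(u)(v) &= \langle p(0), v\rangle + \beta\langle (u - \bar u)_x, v_x\rangle \\
&= \langle p(0) - \beta(u - \bar u)_{xx}, v\rangle,
\end{aligned} \tag{4}
\]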

where p(0) denotes the solution to the adjoint equation 

\[
\begin{aligned}
-p_t - p_x y - \epsilon p_{xx} &= -[H^*(Hy - z)]_{xx} \\
p(0,t) = p(1,t) &= 0 \\
p(T) &= 0
\end{aligned} \tag{5}
\]

evaluated at time $t = 0$. Thus $u$ is a critical point of $J_T$ precisely when
$p(0) = \beta(u - \bar u)_{xx}$. Note that this is a nonlinear equation for $u$, because $p$
depends linearly on $y$, which in turn depends nonlinearly on $u$. Writing these
dependencies more explicitly as $p = p[y]$ and $y = y(u)$, we have that $u$ is a
critical point if and only if it is a fixed point of the nonlinear map



\[
S_T(u) = \bar u + \frac{1}{\beta}\,\Delta^{-1} p[y(u)]\Big|_{t=0} \tag{6}
\]



acting on V. 

Our goal is thus to show that $S_T$ is a contraction mapping under the conditions
of Theorem 2. This is accomplished in the following section by establishing
certain decay estimates for solutions of Burgers' equation. This large-time decay
is strong enough to offset the growth in the corresponding solutions of the
adjoint equation, which is caused by the data-dependent forcing term on the
right-hand side of equation (5), and so we are able to obtain the desired bound
on $S_T$ when $T$ is large.



3 The contractive estimate 

To study the contractivity of $S_T$, we fix $u_1$ and $u_2$ in $V$ and define $v = u_1 - u_2$,
$\delta = y_1 - y_2$ and $\rho = p_1 - p_2$, where $y_i := y(u_i)$ and $p_i = p[y_i]$ for $i = 1, 2$.
For the remainder of the paper we let $\|\cdot\|$ denote the $L^2$ norm on $[0,1]$, hence
$\|v\|_V = \|v_x\|$ for any $v \in V$. We thus wish to estimate

\[
\|S_T(u_2) - S_T(u_1)\|_V = \frac{1}{\beta}\,\|\Delta^{-1}\rho(0)\|_V
\]

above in terms of $\|u_2 - u_1\|_V = \|v_x\|$. It is a consequence of standard elliptic
estimates that there is a positive constant $K$ such that

\[
\|\Delta^{-1} p\|_V \le K\|p\|
\]

holds uniformly for all $p \in V$, so by the Poincaré inequality it would suffice to
establish the estimate

\[
\|\rho(0)\| \le C(T)\,\|v\| \tag{7}
\]

with some positive constant satisfying $C(T) \to 0$ as $T \to \infty$.

A key ingredient in the following estimates is the fact that any solution
$y = y(u)$ of Burgers' equation is contained in $L^2(0,\infty; V)$. More explicitly, we
have from (1) that

\[
\int_0^T \|y_x\|^2\,dt \le \frac{\|u\|^2}{2\epsilon} \tag{8}
\]

for any T > 0. 
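
For orientation, estimate (8) can be obtained from a standard energy argument. The following is a sketch, assuming enough smoothness to integrate by parts: multiply (1) by $y$, integrate over $(0,1)$, and use the Dirichlet boundary conditions to see that the nonlinear term drops out.

```latex
\begin{aligned}
\frac12\frac{d}{dt}\|y\|^2
  &= \epsilon\int_0^1 y\,y_{xx}\,dx - \int_0^1 y^2 y_x\,dx
   = -\epsilon\|y_x\|^2 - \Big[\tfrac13 y^3\Big]_{x=0}^{x=1}
   = -\epsilon\|y_x\|^2, \\
\|y(T)\|^2 + 2\epsilon\int_0^T \|y_x\|^2\,dt &= \|u\|^2
  \quad\Longrightarrow\quad
  \int_0^T \|y_x\|^2\,dt \le \frac{\|u\|^2}{2\epsilon}.
\end{aligned}
```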

This will frequently be used in conjunction with the following version of 
Gronwall's inequality. 



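Lemma 1. Suppose $\frac{d}{dt}u(t) \le [a(t) - a_0]\,u(t) + b(t)$, where $a_0$ is a nonnegative
constant and $a(t)$, $b(t)$ are nonnegative functions with $\int_0^\infty a(t)\,dt \le A$, then

\[
u(T) \le e^{A - a_0 T}\left[u(0) + \int_0^T b(t)\,e^{a_0 t}\,dt\right]
\]

for all $T > 0$.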



There are no sign restrictions on $u(t)$, though it will always be nonnegative
in our applications. If $u(0) = 0$ the lemma immediately implies
$u(T) \le e^A \int_0^T b(t)\,dt$. A more subtle application will be given in the proof of
Proposition 3.

Proof. We define the integrating factor

\[
\mu(t) = \exp\int_0^t [a(s) - a_0]\,ds
\]

and observe that $e^{-a_0 t} \le \mu(t) \le e^{A - a_0 t}$ for all $t > 0$. It follows that

\[
\frac{d}{dt}\left(\frac{u(t)}{\mu(t)}\right) \le b(t)\,e^{a_0 t}
\]


and the proof is completed by integrating both sides of the above inequality. □ 
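
The inequality in the lemma can be checked numerically on a concrete example. The sketch below solves the equality case $u' = [a(t) - a_0]u + b(t)$ with a classical Runge-Kutta integrator and compares $u(T)$ with the bound $e^{A - a_0 T}[u(0) + \int_0^T b(t)e^{a_0 t}dt]$. The choices $a(t) = (1+t)^{-2}$ (so that $A = 1$), $a_0 = 1/2$ and $b(t) = e^{-t}$ are illustrative assumptions, not taken from the paper.

```python
import math

A, A0 = 1.0, 0.5  # A bounds the integral of a(t) over (0, inf); A0 plays the role of a_0

def a(t):
    return 1.0 / (1.0 + t) ** 2  # integral over (0, inf) equals 1 = A

def b(t):
    return math.exp(-t)

def rhs(t, u):
    """Equality case of the differential inequality in the lemma."""
    return (a(t) - A0) * u + b(t)

def rk4(f, u0, T, n):
    """Classical fourth-order Runge-Kutta integration of u' = f(t, u) on [0, T]."""
    h = T / n
    t, u = 0.0, u0
    for _ in range(n):
        k1 = f(t, u)
        k2 = f(t + h / 2, u + h * k1 / 2)
        k3 = f(t + h / 2, u + h * k2 / 2)
        k4 = f(t + h, u + h * k3)
        u += h * (k1 + 2 * k2 + 2 * k3 + k4) / 6
        t += h
    return u

def gronwall_bound(u0, T):
    """exp(A - A0*T) * (u0 + int_0^T b(t) exp(A0*t) dt), with the integral in closed form."""
    integral = (1.0 - math.exp(-(1.0 - A0) * T)) / (1.0 - A0)
    return math.exp(A - A0 * T) * (u0 + integral)
```

For this example the computed solution stays below the bound at every time tried, with a visible gap because both inequalities $e^{-a_0 t} \le \mu(t) \le e^{A - a_0 t}$ are strict for $t > 0$.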

We now investigate boundedness of solutions to the adjoint equation (5),
with Dirichlet boundary conditions and terminal condition p(T) = 0. 
Proposition 1. There exists a constant $C$, depending only on $\epsilon$, $\|u\|$ and $\|H\|$,
such that

\[
\|p(t)\| \le C\sqrt{1 + Z(T)^2 - Z(t)^2}
\]

for any $T > 0$ and all $0 \le t \le T$.
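Proof. We differentiate $\|p(t)\|^2$, finding that

\[
\begin{aligned}
-\frac12\frac{d}{dt}\|p\|^2 &= -\epsilon\int_0^1 p_x^2\,dx + \int_0^1 p\,p_x\,y\,dx + \langle Hp, Hy - z\rangle_Z \\
&\le \left(\frac{\lambda_1\|H\|}{2} + \frac{\lambda_2\|y\|_\infty}{2} - \epsilon\right)\|p_x\|^2
  + \frac{\|y\|_\infty}{2\lambda_2}\|p\|^2 + \frac{\|H\|}{2\lambda_1}\|Hy - z\|_Z^2
\end{aligned}
\]

for any positive $\lambda_1$ and $\lambda_2$. We then choose $\lambda_1 = 2\epsilon/3\|H\|$ and $\lambda_2 = 2\epsilon/3\|y\|_\infty$,
so that

\[
-\frac{d}{dt}\|p\|^2 \le -\frac{2\epsilon}{3}\|p_x\|^2 + \frac{3\|y\|_\infty^2}{2\epsilon}\|p\|^2 + \frac{3\|H\|^2}{2\epsilon}\|Hy - z\|_Z^2 .
\]

After discarding the nonpositive first term, we are now in a position to apply
Lemma 1 to the function $\tilde p(t) := p(T - t)$, because

\[
\int_0^\infty \|y(t)\|_\infty^2\,dt \le \frac{\|u\|^2}{2\epsilon}
\]

by Equation (8). It follows that

\[
\|p(t)\|^2 \le C\int_t^T \|Hy - z\|_Z^2\,ds,
\]

where $C$ depends on $\epsilon$, $\|u\|$ and $\|H\|$. We finally estimate $\|Hy - z\|_Z^2 \le
2\|H\|^2\|y_x\|^2 + 2\|z\|_Z^2$ and integrate, again applying Equation (8) to conclude
that the first term is bounded. □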

We next estimate the difference $\delta = y_1 - y_2$ between two solutions of Burgers'
equation with initial values $u_1$ and $u_2$, respectively.

Proposition 2. There exists a constant $A$, depending only on $\epsilon$, $\|u_1\|$ and $\|u_2\|$,
such that

\[
\|\delta(t)\|^2 \le A\,\|u_1 - u_2\|^2 e^{-2\epsilon t}
\]

for $t > 0$.

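Proof. We first compute

\[
\begin{aligned}
\delta_t &= y_{1t} - y_{2t} \\
&= \epsilon y_{1xx} - y_1 y_{1x} - (\epsilon y_{2xx} - y_2 y_{2x}) \\
&= \epsilon(y_1 - y_2)_{xx} + y_2 y_{2x} - y_1 y_{2x} + y_1 y_{2x} - y_1 y_{1x} \\
&= \epsilon\delta_{xx} - \delta y_{2x} - y_1\delta_x
\end{aligned}
\]

and hence find

\[
\begin{aligned}
\frac12\frac{d}{dt}\|\delta\|^2 &\le -\epsilon\|\delta_x\|^2 + 2\|y_2\|_\infty\|\delta\|\|\delta_x\| + \|y_1\|_\infty\|\delta\|\|\delta_x\| \\
&\le \left(\lambda_1\|y_2\|_\infty + \frac{\lambda_2\|y_1\|_\infty}{2}\right)\|\delta\|^2
  + \left(\frac{\|y_2\|_\infty}{\lambda_1} + \frac{\|y_1\|_\infty}{2\lambda_2} - \epsilon\right)\|\delta_x\|^2
\end{aligned}
\]

for any positive $\lambda_1$ and $\lambda_2$. We choose $\lambda_1 = 4\|y_2\|_\infty/\epsilon$ and $\lambda_2 = 2\|y_1\|_\infty/\epsilon$, with
the result that

\[
\frac{d}{dt}\|\delta\|^2 \le \left(\frac{8\|y_2\|_\infty^2}{\epsilon} + \frac{2\|y_1\|_\infty^2}{\epsilon}\right)\|\delta\|^2 - \epsilon\|\delta_x\|^2 .
\]

Since $\|\delta\|^2 \le \frac12\|\delta_x\|^2$ by the Poincaré inequality, we then apply Lemma 1 with
$a_0 = 2\epsilon$ and $b(t) = 0$ to obtain the desired result. □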
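
The decay of the difference between two solutions can be observed in a small numerical experiment. The following Python sketch evolves two initial states with an explicit finite-difference scheme and records the discrete $L^2$ norm of their difference. The scheme, the grid and the initial states are illustrative assumptions, and the experiment only shows the qualitative decay, not the constant $A$.

```python
import math

def burgers_step(y, eps, dx, dt):
    """One explicit Euler step of y_t + y*y_x = eps*y_xx with zero boundary values."""
    n = len(y)
    new = [0.0] * n
    for i in range(1, n - 1):
        adv = y[i] * (y[i + 1] - y[i - 1]) / (2 * dx)
        diff = eps * (y[i + 1] - 2 * y[i] + y[i - 1]) / dx ** 2
        new[i] = y[i] + dt * (diff - adv)
    return new

def l2_norm_sq(v, dx):
    """Riemann-sum approximation of the squared L^2(0, 1) norm."""
    return dx * sum(x * x for x in v)

def difference_history(u1, u2, eps, T, dx, dt):
    """Return a list of (t_k, ||y1(t_k) - y2(t_k)||^2) along the two trajectories."""
    y1, y2 = list(u1), list(u2)
    steps = int(round(T / dt))
    history = [(0.0, l2_norm_sq([p - q for p, q in zip(y1, y2)], dx))]
    for k in range(1, steps + 1):
        y1 = burgers_step(y1, eps, dx, dt)
        y2 = burgers_step(y2, eps, dx, dt)
        history.append((k * dt, l2_norm_sq([p - q for p, q in zip(y1, y2)], dx)))
    return history
```

Plotting the logarithm of the second component against time should give a curve that eventually falls at least linearly, which is the behavior the proposition describes.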

We now come to the main estimate of this section, which will yield the 
contractivity of St for sufficiently large values of T. 

Proposition 3. There exists a constant $A$, depending only on $\epsilon$, $\|H\|$, $\|u_1\|$
and $\|u_2\|$, such that

\[
\|\rho(0)\| \le A\,(1 + T)\,(1 + Z(T)^2)\,\|u_1 - u_2\|\,e^{-\epsilon T}
\]

for any $T > 0$.

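Proof. Differentiating as in the previous sections (cf. Proposition 3.2 of [3]),
we obtain

\[
\begin{aligned}
-\frac{d}{dt}\|\rho\|^2 \le{} & \left(\frac{\|H\|^2}{\lambda_1} + \frac{\|y_1\|_\infty}{\lambda_2} + \frac{2\|p_2\|}{\lambda_3} - 2\epsilon\right)\|\rho_x\|^2 \\
& + \lambda_2\|y_1\|_\infty\|\rho\|^2 + \left(\lambda_1\|H\|^2 + 2\lambda_3\|p_2\|\right)\|\delta_x\|^2
\end{aligned}
\]

for any positive $\lambda_1$, $\lambda_2$ and $\lambda_3$. We choose $\lambda_1 = 3\|H\|^2/\epsilon$, $\lambda_2 = 3\|y_1\|_\infty/\epsilon$ and
$\lambda_3 = 6\|p_2\|/\epsilon$, and hence find that

\[
-\frac{d}{dt}\|\rho\|^2 \le \left(\frac{3\|y_1\|_\infty^2}{\epsilon} - 2\epsilon\right)\|\rho\|^2
  + \left(\frac{3\|H\|^4}{\epsilon} + \frac{12\|p_2\|^2}{\epsilon}\right)\|\delta_x\|^2 .
\]

Since $\rho(T) = 0$, it follows from Lemma 1 and Proposition 1 that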
\[
\|\rho(0)\|^2 \le C e^{-2\epsilon T}\int_0^T (1 + Z(T)^2)\,e^{2\epsilon t}\,\|\delta_x(t)\|^2\,dt. \tag{9}
\]

To bound the integral term, we first recall the inequality

\[
\epsilon\|\delta_x\|^2 \le \left(\frac{8\|y_2\|_\infty^2}{\epsilon} + \frac{2\|y_1\|_\infty^2}{\epsilon}\right)\|\delta\|^2 - \frac{d}{dt}\|\delta\|^2
\]

found in the proof of Proposition 2. Then, up to a constant,

\[
\|\delta_x\|^2 \le \left(\|y_{2x}\|^2 + \|y_{1x}\|^2\right)\|u_1 - u_2\|^2 e^{-2\epsilon t} - \frac{d}{dt}\|\delta\|^2 .
\]



Upon inserting this expression into Equation (9) we have

\[
\begin{aligned}
\int_0^T (1 + Z(T)^2)\,e^{2\epsilon t}\,\|\delta_x(t)\|^2\,dt \le{} & C(1 + Z(T)^2)\,\|u_1 - u_2\|^2
  - \Big[(1 + Z(t)^2)\,e^{2\epsilon t}\,\|\delta(t)\|^2\Big]_0^T \\
& + \int_0^T \frac{d}{dt}\Big[(1 + Z(t)^2)\,e^{2\epsilon t}\Big]\,\|\delta(t)\|^2\,dt,
\end{aligned}
\]

where we have integrated by parts and used the fact that $Z(t)$ is nondecreasing.
The boundary term arising from the integration by parts is bounded above by
$\|\delta(0)\|^2 = \|u_1 - u_2\|^2$. For the last term we have

\[
\frac{d}{dt}\Big[(1 + Z(t)^2)\,e^{2\epsilon t}\Big]\,\|\delta(t)\|^2 \le C\left(1 + \|z(t)\|_Z^2 + Z(t)^2\right)\|u_1 - u_2\|^2
\]

and the result follows. □



4 Implications for data assimilation 

In this section we relate the previous analysis to the widely-used 4D-Var data 
assimilation scheme. In 4D-Var one is interested in minimizing a cost functional 
of the form 

\[
J_{4D}(u) := \sum_{i=1}^{N} \big\| Hy(u)|_{t=t_i} - z_i \big\|_Z^2 + \beta\|u - \bar u\|_V^2
\]

which is defined in terms of a finite set of observations $z_1, \ldots, z_N$, taken at times
$t_1, \ldots, t_N$. The observation space $Z$ is usually taken to be $\mathbb{R}^m$ with a weighted
Euclidean norm:

\[
\|z\|_Z^2 = \|R^{-1/2} z\|_{\mathbb{R}^m}^2
\]

where R is the observational error covariance matrix. 
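
For a concrete feel for this weighted norm, the following Python sketch evaluates $\|z\|_Z^2$ in the special case where $R$ is diagonal, that is, where the observation errors are assumed uncorrelated. That assumption and the function name are illustrative only; a full covariance matrix would require a linear solve instead of a componentwise division.

```python
def weighted_norm_sq(z, variances):
    """Squared norm ||R^{-1/2} z||^2 for a diagonal R = diag(variances).

    Each component is divided by its error variance, so that noisier
    observation channels contribute less to the misfit.
    """
    if len(z) != len(variances):
        raise ValueError("z and variances must have the same length")
    return sum(zi * zi / vi for zi, vi in zip(z, variances))
```

For example, an observation `[3.0, 4.0]` with variances `[1.0, 4.0]` has squared weighted norm 9/1 + 16/4 = 13.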

The minimization problem is typically solved using a gradient descent (or
conjugate gradient) method, where the gradient can be evaluated using Equation
(4), or a discrete analog for the case of finitely many observations (see [1] for
details). Thus each evaluation of $DJ$ requires integration of the forward model
(1), followed by integration of the adjoint model (5) backwards in time.
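
The forward-then-adjoint pattern can be illustrated on a much simpler model. The Python sketch below replaces Burgers' equation with a toy scalar ODE $y' = -\epsilon y - y^2$, discretized by explicit Euler, and computes the gradient of a discrete cost by one forward sweep followed by one backward (adjoint) sweep. The scalar model, the cost and all names are hypothetical stand-ins chosen so that the example stays short; this is not the paper's adjoint equation (5).

```python
def forward(u, eps, dt, n):
    """Forward sweep: explicit Euler for y' = -eps*y - y^2 with y(0) = u."""
    ys = [u]
    for _ in range(n):
        y = ys[-1]
        ys.append(y + dt * (-eps * y - y * y))
    return ys

def cost(u, u_bar, z, eps, beta, dt):
    """Discrete cost: sum_k dt*(y_k - z_k)^2 + beta*(u - u_bar)^2, k = 1..n."""
    ys = forward(u, eps, dt, len(z))
    misfit = sum(dt * (y - zk) ** 2 for y, zk in zip(ys[1:], z))
    return misfit + beta * (u - u_bar) ** 2

def gradient(u, u_bar, z, eps, beta, dt):
    """Gradient of the discrete cost via a forward sweep and an adjoint sweep."""
    n = len(z)
    ys = forward(u, eps, dt, n)  # forward model, states are stored
    p = 0.0                      # terminal condition for the adjoint variable
    for k in range(n, 0, -1):    # march backwards in time
        p += 2 * dt * (ys[k] - z[k - 1])             # data-misfit forcing at step k
        p *= 1 + dt * (-eps - 2 * ys[k - 1])         # derivative of the Euler step at y_{k-1}
    return p + 2 * beta * (u - u_bar)
```

A finite-difference quotient of `cost` provides an independent check of `gradient`. The cost of one gradient evaluation is one forward and one backward sweep, regardless of how many unknowns the initial state contains, which is the main attraction of the adjoint approach.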

This approach, while guaranteed to produce a local minimizer of $J_T$, will not
necessarily find the global minimum, even under the assumption that it is unique.
Only in the event that $J$ has a unique local minimum can we ensure the
convergence of a steepest descent algorithm to this desired minimum. It is thus of
great interest to know when this uniqueness condition is satisfied.
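
The dependence of steepest descent on its starting point is easy to reproduce on a one-dimensional nonconvex function. The sketch below uses the hypothetical double-well function $f(x) = x^4 - 2x^2 + 0.3x$, which is unrelated to $J_T$ and serves only to illustrate the phenomenon. It has a global minimum near $x \approx -1.04$ and a second local minimum near $x \approx 0.96$.

```python
def f(x):
    """Toy nonconvex objective with two local minima (an illustrative stand-in for J)."""
    return x ** 4 - 2 * x ** 2 + 0.3 * x

def df(x):
    """Derivative of f."""
    return 4 * x ** 3 - 4 * x + 0.3

def gradient_descent(x0, step=0.05, iters=500):
    """Fixed-step steepest descent started from x0."""
    x = x0
    for _ in range(iters):
        x -= step * df(x)
    return x
```

Starting from `x0 = -0.5` the iteration settles in the left well, while `x0 = 0.5` leads to the right well, whose function value is strictly larger. Both limits are critical points, but only one is the global minimizer.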

For the continuous-time functional $J_T$, it was shown in [3] that this is true
for sufficiently small observation times $T$, but also observed that this property
likely fails for larger $T$, due to the nonlinearity of the forward model $u \mapsto y(u)$
and the resulting nonconvexity of $J_T$. We have shown in this paper that
uniqueness also holds for sufficiently large $T$, given some additional, physically
reasonable, assumptions on the set of potential minimizers. This arises as a result of
the large-time exponential decay exhibited by solutions to Burgers' equation,
which essentially demonstrates the dominance of the linear diffusive term over
the nonlinear advective term in the large-$T$ limit. These results, however, do
not immediately apply to the uniqueness problem for the 4D-Var cost
functional, as they assume continuous-time observations, a technical convenience
that is of course impossible to realize in the real world.

Thus an important next step is to modify the present analysis so that it is 
applicable to the case of discrete observations, i.e. the true 4D-Var minimization
problem. It will also be of great interest to apply these techniques to problems 
in higher spatial dimensions, and with more complicated nonlinearities. 

While the present study has only considered the inverse problem for Burgers' 
equation from the variational perspective, there is in fact a great deal more in- 
formation encoded in the cost function J than just the location and uniqueness 
of extrema. There is an equivalent Bayesian formulation of the inverse
problem in which $J$ corresponds, up to normalization, to the negative logarithm of
the posterior density $P(u|z)$ (see [2] for details). In this interpretation, minimizers of $J$ correspond to
modes of the posterior distribution, but of course there is much that one can say 
about a probability distribution beyond its modal structure. For applications, 
covariance information is of great importance, as it gives an indication of how 
trustworthy our solution to the inverse problem is. The nonlinearity of Burg- 
ers' equation means that higher moments of the posterior distribution will also 
contain nontrivial information, as would any other measure of the global shape 
of the distribution. It is in this formulation that we hope to better understand 
the limiting behavior of solutions to the inverse problem for dynamical systems 
with more severe nonlinearities and more complicated asymptotic structure. 